CD4 PRODUCTION IN PICHIA PASTORIS


Field of the Invention 

This invention relates to the field of 
recombinant DNA technology. More particularly, the 
invention concerns the development of Pichia pastoris 
yeast strains capable of high-level production and 
secretion of at least a portion of the human T-cell 
receptor molecule CD4 (also referred to as T4 protein) 
containing the site of interaction between CD4 and the 
human immunodeficiency virus HIV. 


Background of the Invention

The CD4 protein is a glycoprotein of
approximately 60,000 daltons molecular weight that is
expressed on the cell membrane of mature, thymus-derived
(T) lymphocytes, and to a lesser extent on cells
of the monocyte/macrophage lineage. The CD4 molecule
consists of four tandem extracellular domains, which
share significant sequence and structural homology with
the variable (V) and joining (J) regions of
immunoglobulin gene family members, a single
membrane-spanning domain, and a carboxy-terminal
cytoplasmic segment.

The molecule was originally described as a
marker distinguishing the helper/inducer subset of mature
T lymphocytes [Reinherz et al., Cell 19, 821 (1980);
Goldstein et al., Immunol. Review 68, 5 (1982)], and is
known to be involved in the interaction of these cells
with components of the immune system that express class
II major histocompatibility complex (MHC) antigen
molecules [see, for example, Swain, Immunol. Review 74,
129 (1983); Gay et al., Nature 328, 626 (1987); Doyle
et al., Nature 330, 256 (1987)].

The isolation and nucleotide sequence of a cDNA
encoding the CD4 surface glycoprotein was first reported
by Maddon et al. (Columbia University), Cell 42, 93
(1985) and is disclosed in PCT Patent Application
Publication No. WO 88/01304. However, the published
sequence proved to be incorrect at its N-terminus due to
a sequencing error in which the AAG codon (nucleotides
151-153 in the cDNA clone pT4B) was reported as AAC.
Accordingly, the authors originally predicted an
asparagine (asn) at the +3 position. Subsequently,
Richard Axel's group at Columbia University resequenced
the pT4B cDNA and sequenced three cDNAs from different
libraries, as well as genomic clones encoding CD4. They
found that CD4 actually contains lysine (lys) at the
position originally assigned to asn, and that the residue
originally designated as +3 is, in fact, the amino-terminal
residue [Littman et al., Cell 55, 541 (1988)].
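
The single-base difference described above can be checked against the standard genetic code. The following Python sketch is illustrative only: the function names and the two-entry codon table are assumptions introduced here and are not taken from the cited publications.

```python
# Minimal sketch: the two codons discussed above, looked up in the
# standard genetic code (only the two entries needed here are listed).
CODON_TABLE = {
    "AAG": "Lys",  # lysine, the residue found on resequencing
    "AAC": "Asn",  # asparagine, the residue originally reported
}


def translate_codon(codon: str) -> str:
    """Return the three-letter amino acid code for a DNA codon."""
    return CODON_TABLE[codon.upper()]


def count_mismatches(a: str, b: str) -> int:
    """Count the positions at which two equal-length codons differ."""
    if len(a) != len(b):
        raise ValueError("codons must have the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


if __name__ == "__main__":
    print(translate_codon("AAG"), translate_codon("AAC"))
    print(count_mismatches("AAG", "AAC"))
```

A change in the third base of the codon is thus sufficient to turn the predicted residue from lysine into asparagine.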

Of immediate interest is the finding that the
human CD4 protein binds the human immunodeficiency virus
(HIV), the causative agent of AIDS, and it is believed
that the HIV virus gains entry to the cells through
interaction with the CD4 "receptor". Amongst the
earliest publications concerning the interaction of the
CD4 molecule and the HIV virus are, for example,
Dalgleish et al., Nature 312, 763 (1984); Klatzmann
et al., Nature 312, 767 (1984); McDougal et al.,
J. Immunol. 135, 3151 (1985); and McDougal et al.,
Science 231, 382 (1986).

Recent reports have described transfection of
CD4- cells with CD4-encoding DNA, and the newly
acquired ability of the transformed cells to bind HIV and
become infected. Thus, Maddon et al., Cell 47, 333
(1986) described the recombinant expression of the CD4
(T4) gene in human lymphoid and epithelial cells, and the
ability of previously T4- cells to bind to and become
infected with HIV after they had been transformed with
recombinant vectors and thereby became T4+ cells. The
authors also showed that recombinant human CD4 on mouse
cells did not allow for HIV infection, although it bound
HIV.

Further publications concerning the recombinant
production of soluble CD4 protein, and the ability of the
recombinant product to interact with HIV and thus inhibit
infection, are: Smith et al., Science 238, 1704 (1987);
Fisher et al., Nature 331, 76 (1988); Hussey et al.,
Nature 331, 78 (1988); Deen et al., Nature 331, 82
(1988); and Traunecker et al., Nature 331, 84 (1988). A
concise review on the subject of HIV/CD4 interaction has,
for example, been published by Q. J. Sattentau and R. A.
Weiss in Cell 52, 631 (1988).

Smith et al., supra, produced soluble, secreted
forms of the CD4 antigen molecule by transfection of
mammalian (CHO) cells with vectors encoding truncated
versions of CD4, in which the transmembrane and
cytoplasmic domains were replaced with a short linker
sequence containing an in-frame stop codon. The authors
worked under the assumption that the deduced amino acid
sequence of CD4 as originally published by Maddon et al.
was correct.

Fisher et al., supra, constructed three
truncated CD4 genes that lacked the transmembrane and
cytoplasmic domains, and produced recombinant soluble CD4
protein in dihydrofolate reductase (DHFR) mutant CHO
cells. The authors discovered the discrepancy in the CD4
amino acid sequence when they sequenced their own cDNA,
but attributed it to a possible allelic polymorphism and
chemically changed the AAG codon to an AAC codon to
obtain a CD4 protein sequence "identical to that
previously reported" (page 331).

To produce a secreted form of CD4, Hussey
et al., supra, report the expression of a truncated CD4 gene
in Spodoptera frugiperda (SF) cells, using a baculovirus
(AcNPV) expression system. Milligram quantities of a
hydrophilic extracellular segment of CD4 were generated.

Deen et al., supra, described an expression
system in which a recombinant, soluble form of CD4 was
secreted into tissue culture supernatants. Supernatants
from clones were monitored for the expression of soluble
CD4, among others, by Western blot analysis using a
rabbit anti-CD4 polyclonal antibody "developed against a
denatured CD4 protein produced in bacteria" (page 82).

Traunecker et al., supra, report the production
and secretion of two soluble chimeric CD4 proteins from
myeloma cells. The secreted proteins retained "at least
some of their original conformation" (page 84).

The specific sequences of CD4 and the HIV virus
that are required for interaction have also been
identified.

The component of HIV that mediates binding of
the virus to CD4 is the surface glycoprotein, gp120
[Lasky et al., Cell 50, 975 (1987)]. To date, antibodies
raised against gp120 have been ineffective in blocking
viral infection either in vitro or in vivo. The inability
to block is probably related to the heterogeneity seen
among gp120 protein sequences from different viral
isolates. Antibodies raised against gp120 from one HIV
isolate will not necessarily recognize gp120 from a
different isolate. Also, the CD4-binding region of gp120
is not accessible to antibody molecules and thus may be
capable of binding CD4 even if antibody does bind gp120.

Studies using monoclonal antibodies to CD4 have
identified the first variable region, V1, comprising the
N-terminal 106 amino acids of mature CD4, as the site of
interaction with gp120 [Berger et al., PNAS 85, 2357
(1988)]. Further analyses of binding, using mutant CD4
proteins, truncated derivatives of CD4, HIV, and purified
gp120, have narrowed this assignment to amino acid
residues 40-48 within V1 [Peterson and Seed, Cell 54, 65
(1988)]. However, other conserved structures within V1
are probably essential to achieve the highest affinity
binding of HIV to CD4.

The affinity of the HIV virus for CD4 makes
this molecule a rational target for development of an
effective AIDS therapy or prevention. It might be
possible to block the entry of HIV into CD4-expressing
T-cells through the use of anti-CD4 antibodies or the
presence of excess soluble CD4 molecules. In the case of
antibodies, the CD4 receptor present on the T-cell
surface may be unable to bind the HIV gp120 "ligand" if
the CD4 is first bound by antibody. Alternatively, if an
excess of soluble CD4 molecules is present in a sample
comprising CD4-expressing T-cells and HIV, then a large
proportion of the virus might bind to the soluble CD4 and be
inhibited from binding the cell-associated receptor, so that
viral infection might be lessened or prevented.

Chao et al., J. Biol. Chem. 264, 5812 (1989)
expressed a gene encoding a 113-amino acid, NH2-terminal
fragment of CD4 (rsT4.113) in E. coli under the control
of the E. coli tryptophan operon promoter. An insoluble
product, found in inclusion bodies, is obtained at
5 to 10% of total protein; its purification
provides the recombinant peptide at less than 20% of the
starting material. The product, unlike naturally
occurring CD4, contains an unblocked N-terminal methionine
group.

In view of the promising therapeutic results,
there is a great need for a recombinant expression system
that is suitable for the efficient, large-scale
production of a soluble, authentic form of the CD4
protein that, after purification, is suitable for use in
possible AIDS therapies and preventive measures.

The Pichia pastoris yeast expression system,
developed in part by scientists at SIBIA, the assignee of
the present patent application, has proved to be
instrumental in the production of several heterologous
proteins. This system is based on methanol-regulated
promoters and high cell density fermentation. Because
Pichia pastoris is a methylotrophic yeast, it has
metabolic pathways that respond to and regulate methanol
utilization. A key enzyme in the methanol utilization
pathway is alcohol oxidase, a protein encoded by two
genes, AOX1 and AOX2. When Pichia cells are grown in the
presence of methanol, the AOX1 and AOX2 genes are
transcribed and a large amount of alcohol oxidase protein
is produced. The high level of AOX gene expression is
mainly due to the methanol-responsive AOX1 gene promoter,
which is activated in the presence of methanol. This
promoter is highly expressed and tightly regulated (see,
e.g., European Patent Application No. 85113737.2,
published June 4, 1986, under No. 183 071). After
identification and isolation of the AOX1 regulatory
elements, a methanol-responsive gene expression system
was developed in Pichia that places heterologous
genes under the regulation of the AOX1 promoter [Cregg
et al., Bio/Technology 5, 479 (1987)]. Another key
feature of the P. pastoris expression system is the
stable integration of expression cassettes into the
P. pastoris genome, thus significantly decreasing the chance
of vector loss.

Although P. pastoris has been used successfully
for the production of various heterologous proteins,
e.g., hepatitis B surface antigen [Cregg et al., supra],
bovine lysozyme [Digan et al., Developments in Industrial
Microbiology 29, 59 (1988); Digan et al., Bio/Technology
7, 160 (1989)], and Saccharomyces cerevisiae invertase
[Tschopp et al., Bio/Technology 5, 1305 (1987)],
endeavors to produce other heterologous gene products in
Pichia, especially by secretion, have given mixed results
and, in some cases, have been unsuccessful. At our
present level of understanding of the P. pastoris
expression system, it is unpredictable whether a given
gene can be expressed to an appreciable level in this
yeast, whether the expression yields a product that is
stable under ordinary fermentation conditions and
subsequent processing, or whether Pichia will tolerate
the presence of the recombinant gene product in its
cells. Further, it is especially difficult to foresee if
a particular protein will be secreted by P. pastoris, and


WO 91/05057 


PCT/US90/0552© 




if it is, at what efficiency. Even for S. cerevisiae,
which has been considerably more extensively studied than
P. pastoris, the mechanism of protein secretion is not
well defined and understood.


The present invention relates to the production
of a secreted soluble form of CD4 protein, containing the
site of interaction between CD4 and the human
immunodeficiency virus (HIV), in Pichia pastoris
(P. pastoris).

In one aspect, the present invention relates to
a P. pastoris yeast cell containing in its genome at
least one copy of a DNA sequence operably encoding in
P. pastoris at least a portion of human CD4 glycoprotein,
containing the site of interaction between CD4 and HIV,
in operational association with a DNA sequence encoding a
signal sequence which functions to direct secretion of
the encoded glycoprotein in P. pastoris, both under the
regulation of a promoter region of a P. pastoris gene.
The signal sequence of the S. cerevisiae alpha-mating
factor (AMF) gene (AMF pre-pro sequence) is a preferred
signal sequence.

In another aspect, the present invention
concerns a DNA fragment which may be contained within, or
may itself be, a circular plasmid, and which comprises at
least one copy of an expression cassette comprising, in
the direction of transcription, a promoter region of a
first P. pastoris gene, a DNA sequence encoding in
P. pastoris at least a portion of human CD4 glycoprotein
containing the site of interaction between CD4 and the
HIV virus, preceded by a DNA sequence encoding a signal
sequence directing the secretion of said glycoprotein or
a portion thereof in P. pastoris, and a transcription
terminator of a second P. pastoris gene, said first and
second P. pastoris genes being identical or different,
and the segments of said expression cassette being in
operational association. The DNA sequence preceding the
CD4 glycoprotein gene preferably is a DNA sequence
encoding the S. cerevisiae AMF pre-pro sequence followed
by a DNA sequence encoding the AMF processing site lys-arg.

Expression vectors containing such DNA
sequences are also within the scope of the invention.

In a further aspect, the invention relates to a
process for producing and secreting at least a portion of
human CD4 glycoprotein, containing the site of
interaction between CD4 and the virus HIV, into the
culture medium. According to this process, P. pastoris
transformants containing in their genome at least one
copy of a DNA sequence operably encoding in P. pastoris
at least a portion of human CD4 glycoprotein, containing
the site of interaction between CD4 and the virus HIV,
in operational association with a DNA sequence encoding a
signal sequence which functions to direct secretion of
the encoded CD4 or CD4 portion in P. pastoris (the
S. cerevisiae AMF pre-pro sequence being preferred), both
under the regulation of a promoter region of a
P. pastoris gene, are grown under conditions allowing the
expression of the DNA sequences in P. pastoris and
secretion of the CD4 glycoprotein into the culture medium
in a substantially pure form devoid of degradation
products.

Brief Description of Drawings

Figure 1 shows the nucleotide sequence and
amino acid sequence of the S. cerevisiae alpha-mating
factor (AMF) pre-pro gene segment.

Figure 2 shows the nucleotide sequence and the
deduced amino acid sequence of a 482 bp DNA fragment
encoding amino acids 1-106 of mature CD4 along with its
leader sequence.

Figure 3 illustrates the construction of the
Pichia pastoris expression vector, pSCD103, for the
production of human CD4-V1.

Figure 4 shows the nucleotide and amino acid
sequence of the EcoRI insert of pSCD103.

Figure 5 is a restriction map of plasmid pA0815.

Figure 6 shows the cell wet weight over time
for fermentation Runs 568, 570, and 571.

Figure 7 shows the time course of fermentation
Run 593 of the two-copy Mut+ strain, G+SCD103S16.

A. Cell density (grams of wet weight/liter),
plotted against time of fermentation.

B. Recombinant human CD4-V1 production
(mg/liter of cell-free fermentor broth)
for the fermentation presented in Figure
7A, plotted against time. The
expression level was determined by the
quantitative Western blot assay.

Figure 8 is a silver-stained gel, run under
reducing conditions, of V1 standard and Pichia pastoris
fermentor broth. Lanes 1-4 (numbered consecutively from
left to right) contain 100, 200, 300 or 400 ng V1
standard, respectively. Lane 6 is 7.5 µl of G+SCD103S16
fermentor broth; lanes 7 and 8 are 5 µl of G-PA0815
fermentor broth containing 100 ng or 200 ng V1 standard,
respectively. Lane 10 is 3.75 µl of G+SCD103S16 broth
with 3.75 µl of G-PA0815 broth; lanes 11 and 12 are 3.75
µl and 7.5 µl of G-PA0815 broth, alone. Lane 13 is
pre-stained molecular size standards obtained from
Diversified Biotech, Newton Centre, MA. They are Low
Range Standards #SDS-100P and are: trypsin inhibitor,
20,400; myoglobin, 16,949; CNBr cleavage fragments of
myoglobin, Fragment IV, 14,404; Fragment III, 8,159;
Fragment II, 6,214; Fragment I, 2,152. These standards
are included to denote relative positions of sample bands
on the gel and have not been used to estimate molecular
mass. Lanes 5 and 9 contain broth from a fermentation
run of strain G+SCD103S16 using slightly different
conditions than those reported in the Examples.

Figure 9 shows the results of N-terminal
sequence analysis of rCD4-V1 produced in Pichia pastoris
compared with the published N-terminal sequence for
mature human CD4. Pichia rCD4-V1 was isolated from a gel
similar to the one in Figure 8, and sequenced as
described in Example 4b.

Figure 10 shows the result of a stability test
performed on rCD4-V1 produced in Pichia pastoris. One
hundred microliter samples of G+SCD103S16 broth were
treated under the following conditions, before 10 µl was
separated on SDS-PAGE and subjected to immunoblotting
with polyclonal sCD4 antibody. All broth samples were
frozen immediately upon removal from the fermentor. The
sample in lane 1 was thawed just prior to SDS-PAGE.
Lanes 2-4 contain samples at pH 2.5; lanes 5-7 contain
samples that had been adjusted to pH 5.0 upon thawing.
Samples in lanes 2 and 5 were thawed and immediately
refrozen; lanes 3 and 6 were thawed and incubated at 4°C
for 20 hours; lanes 4 and 7 were thawed and incubated at
30°C for 20 hours. E. coli V1 (100 ng) was included as a
size standard.

Figure 11 shows the nucleotide sequence of the
human CD4 cDNA and the translated sequence of the CD4
protein.

Detailed Description of the Invention

1. Definitions

An expression system suitable for the
production and secretion of at least a portion of human
CD4 glycoprotein, containing the site of interaction
between CD4 and the human immunodeficiency virus HIV, is
provided. Preferably, this portion is the V1 region of
CD4.

It will be understood that there is some
uncertainty in the literature as to the definition of the
"first variable region" (V1 region) of CD4, often
referred to as "CD4-V1". Studies using recombinant HIV
gp120 demonstrate that the determinants for high affinity
binding lie solely within the first 106 amino acids of
CD4. However, the "recombinant V1" produced in E. coli
contains the first 113 N-terminal amino acids of
human CD4. The terms "first variable region", "V1" or "V1
region" and synonymous expressions, alone or in
combination with other terms, are used throughout the
specification and claims to refer to an amino acid sequence
including at least the first 106 N-terminal amino acids
of mature human CD4 (as shown in Figure 11). However,
polypeptides deficient in one or more amino acids in the
amino acid sequence reported in the literature, or
polypeptides containing additional amino acids, or
polypeptides in which one or more amino acids in the
amino acid sequence of the V1 region of CD4 are replaced
by other amino acids, are within the scope of the
definition used herein, provided that they exhibit the
functional activity of V1, in particular preserve its
HIV-binding properties. The definition used in connection
with the present invention is intended to embrace all the
allelic variations of V1. Moreover, as noted supra,
derivatives obtained by simple modification of the amino
acid sequence of the naturally occurring product, e.g. by
way of site-directed mutagenesis or other standard
procedures, are included.

The term "at least a portion of human CD4
glycoprotein, containing the site of interaction between
CD4 and the human immunodeficiency virus HIV" and
synonymous expressions, as used herein, refer to the
full-length mature human CD4 glycoprotein molecule or any
portion thereof capable of binding the HIV virus. Just
as in the case of the V1 region of CD4, polypeptides
deficient in one or more amino acids in the correct amino
acid sequence reported in the literature for mature human
naturally occurring CD4 or its respective regions, or
polypeptides containing additional amino acids, or
polypeptides in which one or more amino acids in the
amino acid sequence of CD4 or its respective regions are
replaced by other amino acids, are within the scope of
the definition used herein, provided that they exhibit
the functional activity of CD4, in particular preserve
its HIV-binding properties. The definition used in
connection with the present invention is intended to
embrace all the allelic variations of CD4. Moreover, as
noted supra, derivatives obtained by simple modification
of the amino acid sequence of the naturally occurring
protein, e.g. by way of site-directed mutagenesis or
other standard procedures, are included.

The term "DNA sequence operably encoding
in Pichia pastoris at least a portion of human CD4
glycoprotein, containing the site of interaction between
CD4 and the human immunodeficiency virus HIV" and
grammatical variations thereof, as used herein, refers to
DNA sequences encoding in Pichia pastoris "at least a
portion of human CD4 glycoprotein, containing the site of
interaction between CD4 and the human immunodeficiency
virus HIV" such as the "first variable (V1) region", as
hereinabove defined. This sequence contains, but is not
restricted to, the DNA sequence encoding residues 16
through 84 of the mature CD4 protein, which are contained
within the first disulfide-bonded, covalently closed
peptide loop of CD4. Such sequences may be obtained by
chemical synthesis or by reverse transcription of a messenger
RNA (mRNA) corresponding to CD4 or a portion thereof to a
complementary DNA (cDNA) and converting the latter into a
double stranded cDNA. Additionally, the CD4 sequences
may be obtained through the use of polymerase chain
reaction (PCR) on genomic DNA encoding at least the V1
region. The mRNA can be isolated, for example, from T4+
transformed fibroblasts as described by Maddon et al.,
Cell 42, 93 (1985). Chemical synthesis of a gene for human CD4
or a portion thereof is, for example, disclosed by
Jameson et al., Science 240, 1335 (1988); Lifson et al.,
Science 241, 712 (1988). The requisite DNA sequence can
also be removed, for example, by restriction enzyme
digest of known vectors harboring the desired gene.
Examples of such vectors and the means for their
preparation can be taken from the following publications:
Smith et al., supra; Fisher et al., supra;
Deen et al., supra; Traunecker et al., supra, etc.
According to Example 1 of the present application, a 482
bp DNA fragment encoding the V1 portion of human CD4 was
excised from a 2.2 kb linear [illegible] DNA fragment by
digestion with EcoRI and XbaI. However, the CD4-V1
encoding DNA fragment can be removed from other known DNA
fragments as well. For example, a 682 bp EcoRI-[illegible]
fragment from pT4B is disclosed in Maddon et al., Cell
42, 93 (1985). The V1-encoding sequence can be readily
obtained from this fragment by digestion with EcoRI and
XbaI.
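
The excision of a fragment bounded by an EcoRI site and an XbaI site can be sketched in silico as follows. This is a simplified illustration, not the procedure of Example 1: the function name and the toy sequence are hypothetical, the sequence is not the CD4 cDNA, and the whole recognition sequences (GAATTC for EcoRI, TCTAGA for XbaI) are retained at both ends rather than modelling the staggered cuts and overhangs that the enzymes actually produce.

```python
# Recognition sequences of the two enzymes named in the text.
ECORI = "GAATTC"
XBAI = "TCTAGA"


def excise(seq: str, left_site: str = ECORI, right_site: str = XBAI) -> str:
    """Return the region from the first left_site through the next right_site.

    Simplification: both recognition sequences are kept intact at the ends;
    real enzymes cut inside the site and leave single-stranded overhangs.
    """
    seq = seq.upper()
    start = seq.find(left_site)
    if start == -1:
        raise ValueError("left site not found")
    end = seq.find(right_site, start + len(left_site))
    if end == -1:
        raise ValueError("right site not found downstream of left site")
    return seq[start:end + len(right_site)]


# Hypothetical toy sequence used only for illustration.
TOY = "CCG" + ECORI + "ATGAACCGGGGAGTC" + XBAI + "GGTT"

if __name__ == "__main__":
    fragment = excise(TOY)
    print(fragment, len(fragment))
```

On a real sequence the same search would locate the two sites that delimit the V1-encoding fragment; fragment sizes quoted in the text would have to be checked against the actual cut positions.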

The amino acids which occur in the various
amino acid sequences referred to in the specification
have their usual three- and one-letter abbreviations
routinely used in the art, i.e.:

Amino Acid           Abbreviation

L-Alanine            Ala    A
L-Arginine           Arg    R
L-Asparagine         Asn    N
L-Aspartic acid      Asp    D
L-Cysteine           Cys    C
L-Glutamine          Gln    Q
L-Glutamic acid      Glu    E
L-Glycine            Gly    G
L-Histidine          His    H
L-Isoleucine         Ile    I
L-Leucine            Leu    L
L-Lysine             Lys    K
L-Methionine         Met    M
L-Phenylalanine      Phe    F
L-Proline            Pro    P
L-Serine             Ser    S
L-Threonine          Thr    T
L-Tryptophan         Trp    W
L-Tyrosine           Tyr    Y
L-Valine             Val    V
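
For readers who handle the sequences of the Figures electronically, the abbreviation table above can be expressed as a lookup. The sketch below is a convenience only; the function name and the hyphen-separated input format are assumptions made for this illustration.

```python
# Three-letter to one-letter codes for the twenty standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert a hyphen-separated three-letter peptide, e.g. 'Lys-Arg'."""
    residues = [r.strip().capitalize() for r in peptide.split("-")]
    return "".join(THREE_TO_ONE[r] for r in residues)


if __name__ == "__main__":
    # The AMF processing site named in the text.
    print(to_one_letter("lys-arg"))
```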


The promoter region employed to drive the
expression of a gene encoding at least a portion of CD4,
preferably CD4-V1, is derived from a methanol-regulated
alcohol oxidase gene of P. pastoris. P. pastoris is
known to contain two functional alcohol oxidase genes:
alcohol oxidase I (AOX1) and alcohol oxidase II (AOX2)
genes. The coding portions of the two AOX genes are
closely homologous at the DNA and predicted amino acid
sequence levels and share common restriction sites. The
proteins expressed from the two genes have similar
enzymatic properties, but the promoter of the AOX1 gene is
more efficient and more highly expressed; therefore, its use
is preferred for heterologous expression. The AOX1 gene,
including its promoter, has been isolated and thoroughly
characterized [Ellis et al., Mol. Cell. Biol. 5, 1111
(1985)].

The expression cassette used for transforming
P. pastoris cells contains, in addition to the
P. pastoris promoter and the CD4 (CD4-V1) encoding DNA
sequence, a DNA sequence encoding a signal sequence
directing the secretion of the CD4 glycoprotein or a
portion thereof in P. pastoris, preferably a DNA encoding
the in-reading frame S. cerevisiae AMF pre-pro sequence,
and a DNA sequence encoding the AMF processing site, lys-arg
(also referred to as lys-arg encoding sequence), and a
P. pastoris transcription terminator. Although the
S. cerevisiae AMF pre-pro sequence is preferred, other
signal sequences suitable for directing foreign protein
secretion in P. pastoris may also be used. Such
sequences are, for example, the Saccharomyces cerevisiae
invertase signal sequence.
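
The order of the cassette elements in the direction of transcription, as described above, can be summarized in a small data structure. This is a schematic aid only: the class and function names are hypothetical, the AOX1 promoter is shown because it is the preferred promoter, and the entries are labels rather than actual sequences.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One labelled element of the expression cassette (no sequence data)."""
    role: str
    source: str


def build_cassette(coding_region: str = "CD4-V1") -> List[Segment]:
    """List the cassette elements in the direction of transcription."""
    return [
        Segment("promoter", "P. pastoris AOX1"),
        Segment("signal sequence", "S. cerevisiae AMF pre-pro"),
        Segment("processing site", "lys-arg"),
        Segment("coding region", coding_region),
        Segment("transcription terminator", "P. pastoris"),
    ]


if __name__ == "__main__":
    for position, segment in enumerate(build_cassette(), start=1):
        print(position, segment.role, "-", segment.source)
```

Keeping the signal sequence and processing site immediately upstream of the coding region reflects the requirement that they be in the same reading frame as the CD4 sequence.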

The S. cerevisiae alpha-mating factor is a
13-residue peptide, secreted by cells of the "alpha" mating
type, that acts on cells of the opposite "a" mating type
to promote efficient conjugation between the two cell
types and thereby formation of "a-alpha" diploid cells
[Thorner et al., The Molecular Biology of the Yeast
Saccharomyces, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY, 143 (1981)]. The AMF pre-pro sequence is a
leader sequence contained in the AMF precursor molecule,
which, together with the lys-arg encoding sequence, is
necessary for proteolytic processing and secretion (see,
e.g., Brake et al., supra). The AMF pre-pro sequence,
including the lys-arg encoding sequence, is a 255 bp
fragment which is illustrated in Figure 1.

The P. pastoris transcription terminator used
in accordance with the present invention has a subsegment
which encodes a polyadenylation signal and
polyadenylation site in the transcript and/or a
subsegment which provides a transcription termination
signal for transcription from the promoter used in the
expression cassette according to the invention (the term
"expression cassette" as used herein and throughout the
specification and claims refers to a DNA sequence which
includes sequences functional for the expression and
secretion processes). The entire transcription
terminator is taken from a P. pastoris protein-encoding
gene, which may be the same as or different from the
P. pastoris gene which is the source of the P. pastoris
promoter used according to the invention.

The DNA fragments according to the invention
further comprise a selectable marker gene. For this
purpose, any selectable marker gene functional in
P. pastoris may be employed, i.e., any gene which confers
a phenotype upon P. pastoris cells thereby allowing them
to be identified and selectively grown from among a vast
majority of untransformed cells. Suitable selectable
marker genes include, for example, selectable marker
systems composed of an auxotrophic mutant P. pastoris
host strain and a wild type biosynthetic gene which
complements the host's defect. For transformation of
his4- P. pastoris strains, for example, the S. cerevisiae
or P. pastoris HIS4 gene, or for transformation of arg4-
mutants the S. cerevisiae ARG4 gene or the P. pastoris
ARG4 gene, may be employed.

The term "expression vector" includes vectors
capable of expressing DNA sequences contained therein,
where such sequences are in operational association with
other sequences capable of effecting their expression,
i.e. promoter sequences. In general, expression vectors
used in recombinant DNA technology are often in
the form of "plasmids", i.e. circular, double-stranded
DNA loops which, in their vector form, are not bound to
the chromosome. In the present specification the terms
"vector" and "plasmid" are used interchangeably.
However, the invention is intended to include other forms
of expression vectors as well, which function
equivalently.

In the DNA fragment according to the invention
the segments of the expression cassette are "in
operational association". The DNA sequence encoding CD4
or any portion thereof, as hereinabove defined, is
positioned and oriented functionally with respect to the
promoter, the DNA encoding a signal sequence, preferably
the S. cerevisiae AMF pre-pro sequence, the DNA
sequence encoding the AMF processing site, lys-arg, and the
transcription terminator, so that the polypeptide
encoding segment is transcribed, under regulation of the
promoter region, into a transcript capable of providing,
upon translation, the desired polypeptide in P. pastoris.
Because of the presence of the signal sequence, e.g. the
AMF pre-pro sequence, the expressed product, CD4 or a
portion thereof, as hereinabove defined, is found as a
secreted entity in the culture medium, properly processed
away from the AMF pre-pro sequence. Appropriate reading
frame positioning and orientation of the various segments
of the expression cassette are within the knowledge of
persons of ordinary skill in the art; further details are
given in the Examples.

The DNA fragment provided by the present
invention may include sequences allowing for its
replication and selection in bacteria, especially E.
coli. In this way, large quantities of the DNA fragment
can be produced by replication in bacteria.

The term "culture" means a propagation of cells
in a medium conducive to their growth, and all
subcultures thereof. The term "subculture" refers to a
culture of cells grown from cells of another culture
(source culture), or any subculture of the source
culture, regardless of the number of subculturings which
have been performed between the subculture of interest
and the source culture.

The following abbreviations are used throughout
the Examples with the following meanings:

DTT     dithiothreitol
SDS     sodium dodecyl sulfate
PBS     phosphate buffered saline
Mut+    methanol utilization competent
Mut-    methanol utilization defective

Tris-HCl: 1.5M at pH 8.8 and 0.5M at pH 6.8. Both are
stored at 4°C.

The buffers and solutions, the composition of
which is not specified in the Examples, are as follows:

WBB: 1x PBS, 0.05% Tween-20, 0.02% NaN3, 0.25%
gelatin.

Laemmli buffer: see Nature 227, 680 (1970)


3. General Methods

Methods of transforming Pichia pastoris, as well
as methods applicable for culturing P. pastoris cells
containing in their genome a gene for a heterologous
protein, are known generally in the art.

According to the invention, the expression
cassettes are generally transformed into the P. pastoris
cells by the whole-cell lithium chloride yeast
transformation system [Ito et al., Agric. Biol. Chem. 48,
341 (1984)], with minor modification necessary for
adaptation to P. pastoris. Alternatively, the
spheroplast technique, described by Cregg et al.,
Mol. Cell. Biol. 5, 3376 (1985), can also be used for the
transformation of P. pastoris cells. The whole-cell
lithium chloride method is more convenient in that it
does not require the generation and maintenance of
spheroplasts.

Positive transformants are characterized by
Southern blot analysis [E. M. Southern, J. Mol. Biol.
98, 503 (1975); Maniatis et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, New York, USA (1982)] for the site of
DNA integration, and Northern blots [Alwine et al., Proc.
Natl. Acad. Sci. USA 74, 5350 (1977); Maniatis, Op. Cit.]
for methanol-responsive heterologous gene expression.
Total RNA for Northern blot analysis was prepared
essentially as described by Zitomer et al.,
J. Biol. Chem. 251, 6320 (1976).

Nick translation can be performed according to
Meinkoth et al., Methods in Enzymology 152, 91 (1987).

Transformed strains which are of the desired
phenotype and genotype are grown in fermentors. For the
large-scale production of recombinant DNA-based products
in P. pastoris, a three-stage, high cell-density, batch
fermentation system is normally employed. In the first,
or growth, stage, expression hosts are cultured in defined
minimal medium with excess glycerol as carbon source. On
this carbon source heterologous gene expression is
completely repressed, which allows the generation of cell
mass in the absence of heterologous protein expression.
Next, a period of glycerol-limited growth is allowed
to further increase cell density. Subsequent to the
glycerol-limited growth, methanol is added, initiating
the expression of the desired heterologous protein. This
third stage is the so-called production stage. The
fermentation of CD4-V1 essentially followed the three-
stage protocol described in Digan et al., Bio/Technology
7, 160 (1989). However, as shown in the Examples and in
the description of preferred embodiments, in order to
obtain a stable product, the pH had to be maintained at a
lower level than usual for Pichia pastoris fermentations.

According to a preferred embodiment of the
present invention, the V1 region of the human CD4 molecule
is produced in Pichia pastoris. This V1 region contains a
single disulfide bond between two cysteine residues,
which are located near the N- and C-termini,
respectively.

The heterologous protein expression system used
for CD4-V1 production preferably utilizes the promoter
derived from the methanol-regulated AOX1 gene of
P. pastoris, which is very efficiently expressed and
tightly regulated. This gene is the source of the
transcription terminator as well. The expression
cassette preferably comprises, in operational
association, a P. pastoris AOX1 promoter, DNA encoding
the S. cerevisiae AMF pre-pro sequence, a DNA sequence
encoding the AMF processing site, lys-arg, a DNA sequence
encoding the CD4-V1 molecule, and a transcription
terminator derived from the P. pastoris AOX1 gene.

The host cells to be transformed with a linear
vector comprising the expression cassette are P. pastoris
cells having at least one mutation that can be
complemented with a marker gene present on a transforming
DNA fragment. Preferably his4 (GS115) auxotrophic mutant
P. pastoris strains are employed.

The expression cassette is inserted into a
plasmid containing a marker gene complementing the host's
defect. pBR322-based plasmids, e.g. pAO815, are
preferred. Plasmid pAO815 comprising the CD4-V1
expression/secretion cassette is called pSCD103. The
construction of this plasmid is disclosed in
Example 1.

To develop expression strains, the expression
cassette is preferably integrated into the host genome
by means of the homologous sequences present on the
transforming DNA. The expression cassette or entire
vector is integrated into the host genome by a one-step
gene replacement or addition technique. This approach
avoids the problems of plasmid instability. As a result
of gene replacement, Mut- strains are obtained. Mut refers
to the methanol-utilization phenotype. In Mut- strains,
the AOX1 gene is replaced with the expression cassette,
thus decreasing the transformant's ability to utilize
methanol. A slow growth rate on methanol is maintained
by expression of the AOX2 gene product. The
transformants in which the expression cassette has
integrated into the AOX1 locus by site-directed
recombination can be identified by first screening for
the presence of the complementing gene. This is
preferably accomplished by growing the cells in a medium
lacking the complementing gene product and identifying
those cells which are able to grow by virtue of
expression of the complementing gene. Next, the selected
cells are screened for their Mut phenotype by growing
them in the presence of methanol and monitoring their
growth rate.

To develop Mut+ strains, the expression cassette
preferably is integrated into the host genome by
transformation of the GS115 host with SacI-linearized
plasmid pSCD103 comprising the V1 expression cassette.
The integration is by addition at a locus or loci having
homology with one or more sequences present on the
transformation vector.

Positive transformants are characterized by
Southern analysis for the site of DNA integration, by
Northern analysis for methanol-responsive CD4-V1 gene
expression, and by immunoblot product analysis for the
presence of secreted CD4-V1 in the growth media.
P. pastoris strains which have integrated one or multiple
copies of plasmid at a desired site are identified by
Southern blot analysis. Strains which demonstrate
enhanced expression of the heterologous gene may be
identified by Northern analysis, and enhanced secretion
of the recombinant protein by product analysis.

For Mut- strains the CD4-V1 production levels
were found to be somewhat lower than for Mut+ strains, but
the difference was not very significant. Mut+ P. pastoris
strains integrating multiple copies of the expression
vector (or of the AMF-V1 expression cassette) used for
transformation at the AOX1 locus are preferred, since an
increase in copy number often increases productivity.

P. pastoris transformants which are identified
to have the desired genotype and phenotype are grown in
fermentors. Typically a three-step production process is
used. Initially, cells are grown on a repressing carbon
source, preferably excess glycerol. In this stage the
cell mass is generated in absence of expression. Next, a
period of glycerol-limited growth is allowed, and then
a limiting methanol feed is initiated, resulting in the
expression of the V1 gene driven by the AOX1 promoter.

It has been found that in the usual pH-range
used for heterologous protein production in P. pastoris
(about pH 5.0) the V1 product suffers a substantial
proteolytic degradation. To avoid or, at least, reduce
product degradation, fermentation is preferably performed
at pHs below about 3.5, preferably between about pH 2.5
and 3.5, more preferably between about pH 2.5 and 3.0,
for example at about pH 2.6. The pH can be adjusted to
the desired value by methods known in the art, preferably
before the induction of V1 production.
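The nested pH preferences stated above can be written as a small lookup. This is only an illustrative sketch: the function name and the wording of the labels are assumptions, and the boundaries are the "about" values from the text, not exact process limits.

```python
def ph_preference(ph: float) -> str:
    """Rank a fermentation pH against the nested ranges given in the text.

    The label strings are hypothetical; only the numeric boundaries
    (2.5, 3.0, 3.5) come from the description.
    """
    if 2.5 <= ph <= 3.0:
        return "more preferred (about pH 2.5 to 3.0)"
    if 2.5 <= ph <= 3.5:
        return "preferred (about pH 2.5 to 3.5)"
    if ph < 3.5:
        return "acceptable (below about pH 3.5)"
    return "outside the stated range"


# The example value from the text (about pH 2.6) and the usual
# P. pastoris value (about pH 5.0) fall at opposite ends.
for value in (2.6, 3.2, 5.0):
    print(value, "->", ph_preference(value))
```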

The level of CD4-V1 secreted into the media can,
for example, be determined by quantitative Western blot
analysis of the media in parallel with a standard (e.g.
an E. coli-produced V1 standard), using reducing or non-
reducing conditions.

The invention is further illustrated by the
following non-limiting examples.


4. Examples

Example 1

Vector construction
I. Construction of the expression vector pSCD103

The expression vector construction disclosed in
the present application was performed using standard
procedures, as described, for example, in Maniatis et al.,
supra, and Davis et al., Basic Methods in Molecular
Biology, Elsevier Science Publishing, Inc., New York
(1986).

A 2.2 kb linear BglII-NheI DNA fragment
containing a segment encoding the V1 portion of human CD4,
accompanied by flanking DNA from E. coli, was obtained
from Smith Kline & French Laboratories (U.S.A.). The V1-
encoding sequence was excised from this fragment by
digestion with EcoRI and XbaI, and isolating the 482 bp
fragment (the sequence encoding amino acids 1-106 of
mature CD4 along with its leader sequence) on a 1.3%
agarose gel (Figure 2). Fifty nanograms of the 482 bp
fragment were ligated to 100 ng of the plasmid pIBI25,
previously cut with EcoRI and XbaI. Plasmid pIBI25 was
purchased from IBI, New Haven, CT, and contains an f1
origin of replication and the T7 promoter. E. coli
MC1061 cells [M.J. Casadaban and S.N. Cohen, J. Mol.
Biol. 138, 179 (1980)] were transformed with ligation
products and AmpR colonies were selected. Correct plasmid
demonstrated a 477 bp band upon digestion with EcoRI and
XbaI and was called pSCD4.

The AMF pre-pro sequence was isolated from
M13mp19αMF pre-pro by digesting with EcoRI and BamHI and
isolating the about 267 bp fragment on a 1.3% agarose
gel. To prepare plasmid M13mp19αMF, 15 µg of plasmid
pAO208 (the construction of which is described
hereinafter) were digested with [restriction enzyme
illegible in source], treated with
Klenow-fragment DNA polymerase, and digested with EcoRI.
The digestion was run on a 1.7% agarose gel and the 267
bp fragment comprised of the AMF pre-pro sequence was
isolated. The hEGF (human epidermal growth factor) gene
and the AMF pre-pro sequence in the same translational
direction were inserted into M13mp19 (New England
Biolabs) by the following procedure:
3.0 µg of M13mp19 were digested with SmaI and
EcoRI and the large, about 7240 bp plasmid fragment was
isolated on a 0.8% agarose gel. The plasmid fragment and
the 267 bp AMF fragment were ligated together by T4 DNA
ligase. The ligation mixture was transformed into JM103
cells and DNA from the plaques was characterized. The
correct plasmid was called M13mp19αMF.

Twenty-five nanograms of the EcoRI-BamHI
fragment of M13mp19αMF pre-pro were ligated to 100 ng of
pIBI25 previously cut with EcoRI and BamHI, and the
ligation products were transformed into MC1061 cells.
AmpR colonies were selected and the correct plasmid was
identified by digestion with EcoRI and BamHI. The
correct plasmid demonstrated a 260 bp band, and was
called pAMF101 (Figure 3).

pSCD4 was digested with EcoRI, made blunt-ended
by treatment with Klenow fragment of E. coli DNA
polymerase I, and then digested with XbaI. The 477 bp V1-
encoding fragment was isolated on a 1.2% agarose gel.
pAMF101 was likewise digested with BamHI, treated with
Klenow fragment of E. coli DNA polymerase I, digested
with XbaI, and then dephosphorylated. Fifty nanograms of
the 477 bp V1-encoding fragment were ligated to 100 ng of
the linearized vector, and the ligation was transformed
into E. coli CJ236 cells (BioRad, Richmond, CA; Muta-gene
mutagenesis kit, # 170-3571). AmpR colonies were
selected. The correct plasmid exhibited a 740 bp band
upon digestion with EcoRI and XbaI and was called pSCD101
(Figure 3).

Mutagenesis was performed to fuse the AMF pre-
pro sequences directly to the V1 coding region; the STE2
processing sites (glu-ala-glu-ala) of the AMF pre-pro
sequence and the native CD4 leader sequence were
eliminated by the oligonucleotide-directed mutagenesis.
Single-stranded pSCD101 template was prepared following
the procedure of Russel et al., Gene 45, 333 (1986),
using the helper phage R408. The mutagenizing and
screening oligonucleotide was of the following sequence:
5' GGG TAT CTT TGG ATA AAA GAA AGA AAG TGG
TGC TGG GCA A 3'
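As an illustrative check on the fusion this oligonucleotide is meant to create, the sketch below translates it in all three forward reading frames using the standard genetic code and reports which frame contains the lys-arg (KR) processing site. The triplet grouping printed in the text runs from the 5' end and need not be the coding frame; which frame is the coding one is an inference made here, not a statement from the source.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Mutagenizing/screening oligonucleotide from the text, spaces removed.
OLIGO = "GGGTATCTTTGGATAAAAGAAAGAAAGTGGTGCTGGGCAA"


def translate(seq: str, offset: int) -> str:
    """Translate complete codons of seq starting at the given offset."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


frames = {offset: translate(OLIGO, offset) for offset in range(3)}
for offset, peptide in frames.items():
    marker = "  <- contains lys-arg" if "KR" in peptide else ""
    print(f"frame offset {offset}: {peptide}{marker}")
```

Only the frame at offset 2 contains lys-arg, and the residues that follow it (KKVVLG) would then be the first residues of the fused coding region, which appears consistent with the stated aim of joining the pre-pro sequence directly to V1.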

Mutagenesis reaction products were transformed
into MC1061 cells; colonies transformed with the
mutagenized plasmid were first identified by
hybridization with the screening oligonucleotide, and
then the correct mutagenesis was confirmed by sequencing.
The correctly mutagenized plasmid was called pSCD102.

An EcoRI linker was added to the 3' end of the
AMF pre-pro-V1 insert by digesting pSCD102 with XbaI,
blunt-ending with E. coli DNA polymerase I Klenow
fragment, and ligating 100 ng of the vector to 15 ng of
EcoRI linkers having the sequence:

5' GGAATTCC 3'

The ligation products were digested with EcoRI and the
560 bp AMF pre-pro-V1 fragment was isolated on a 1.2%
agarose gel. Twenty nanograms of the 560 bp fragment
were then ligated to 100 ng of EcoRI-digested and
phosphatase-treated pAO815 (the construction of which is
described hereinbelow). The ligation products were
transformed into MC1061 cells and the AmpR colonies were
selected. The correct plasmid demonstrated a 1675 bp
band upon digestion with PstI, and was called pSCD103
(Figure 3). The nucleotide and amino acid sequence of
the EcoRI insert of pSCD103 is shown in Figure 4.

II. Construction of plasmid pAO208:

The AOX1 transcription terminator was isolated
from 20 µg of pPG2.0 [pPG2.0 = BamHI-HindIII fragment of
pPG4.0 (NRRL 15868) + pBR322] by StuI digestion followed
by the addition of 0.2 µg of SalI linkers (GGTCGACC). The
plasmid was subsequently digested with HindIII and the
350 bp fragment isolated from a 10% acrylamide gel and
subcloned into pUC18 (Boehringer Mannheim) digested with
HindIII and SalI. The ligation mix was transformed into
JM103 cells (that are widely available) and AmpR colonies
were selected. The correct construction was verified by
HindIII and SalI digestion, which yielded a 350 bp
fragment, and was called pAO201.

5 µg of pAO201 was digested with HindIII,
filled in using E. coli DNA Polymerase I Klenow fragment,
and 0.1 µg of BglII linkers (GAGATCTC) were added. After
digestion of the excess BglII linkers, the plasmid was
reclosed and transformed into MC1061 cells. AmpR cells
were selected, DNA was prepared, and the correct plasmid
was verified by BglII/SalI double digests, yielding a
350 bp fragment, and by a HindIII digest to show loss of
the HindIII site. This plasmid was called pAO202.

The alpha factor-GRF fusion was isolated as a
360 bp BamHI-PstI partial digest from pYSV201. Plasmid
pYSV201 is the EcoRI-BamHI fragment of GRF-E-3 inserted
into M13mp18 (New England Biolabs). Plasmid GRF-E-3 is
described in EP 206,783. 20 µg of pYSV201 plasmid was
digested with BamHI and partially digested with PstI. To
this partial digest were added the following
oligonucleotides:

5' AATTCGATGAGATTTCCTTCAATTTTTACTGCA 3'
3'     GCTACTCTAAAGGAAGTTAAAAATG     5'

Only the antisense strand of the oligonucleotide was
kinase labelled so that the oligonucleotides did not
polymerize at the 5'-end. After acrylamide gel
electrophoresis (10%), the fragment of 385 bp was
isolated by electroelution. This EcoRI-BamHI fragment
of 385 bp was cloned into pAO202 which had been cut with
EcoRI and BamHI. Routinely, 5 ng of vector cut with the
appropriate enzymes and treated with calf intestine
alkaline phosphatase was ligated with 50 ng of the
insert fragment. MC1061 cells were transformed, AmpR
cells were selected, and DNA was prepared. In this case,
the resulting plasmid, pAO203, was cut with EcoRI and
BglII to yield a fragment of greater than 700 bp. The α-
factor-GRF fragment codes for the (1-40) leu27 version of
GRF and contains the processing sites lys-arg-glu-ala-
glu-ala.
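The adapter formed by the two oligonucleotides above can be checked with a short sketch. It anneals the strands as printed and reports the single-stranded ends, which come out as an AATT 5' overhang (EcoRI-compatible) and a TGCA 3' overhang (PstI-compatible). The function name is an assumption made for illustration; the sequences are those given in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TOP = "AATTCGATGAGATTTCCTTCAATTTTTACTGCA"   # written 5' -> 3'
BOTTOM_3_TO_5 = "GCTACTCTAAAGGAAGTTAAAAATG"  # written 3' -> 5', as in the text


def single_stranded_ends(top: str, bottom_3_to_5: str) -> tuple[str, str]:
    """Return the unpaired top-strand bases left and right of the duplex."""
    # A full-length partner of the top strand, read 3' -> 5'.
    partner = top.translate(COMPLEMENT)
    start = partner.find(bottom_3_to_5)
    if start < 0:
        raise ValueError("strands are not complementary as written")
    end = start + len(bottom_3_to_5)
    return top[:start], top[end:]


left, right = single_stranded_ends(TOP, BOTTOM_3_TO_5)
print("5' overhang:", left)    # pairs with an EcoRI-cut end
print("3' overhang:", right)   # pairs with a PstI-cut end

# The duplex region adds 25 bp to the 360 bp BamHI-PstI fragment,
# matching the 385 bp EcoRI-BamHI fragment isolated in the text.
print("adapter duplex + fragment:", len(BOTTOM_3_TO_5) + 360, "bp")
```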

The AOX1 promoter was isolated as a 1900 bp
EcoRI fragment from 20 µg of pAOP3 and subcloned into
EcoRI-digested pAO203. The development of pAOP3 is
disclosed in EP 226,846 and described hereinbelow.
MC1061 cells were transformed with the ligation reaction,
AmpR colonies were selected, and DNA was prepared. The
correct orientation contains an approximately 376 bp
HindIII fragment, whereas the wrong orientation has an
approximately 675 bp fragment.
One such transformant was isolated and was called pAO204.
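Orientation screening of this kind reduces to comparing an observed band with two expected sizes. The sketch below is a hypothetical helper, not part of the described procedure: the function name and the 10% sizing tolerance are assumptions, while the 376 bp and 675 bp values are the HindIII fragment sizes given above.

```python
def call_orientation(observed_bp: float,
                     correct_bp: float = 376,
                     wrong_bp: float = 675,
                     tolerance: float = 0.10) -> str:
    """Classify an insert orientation from one diagnostic band size.

    tolerance is an assumed fractional gel-sizing error, not a value
    taken from the text.
    """
    if abs(observed_bp - correct_bp) <= tolerance * correct_bp:
        return "correct"
    if abs(observed_bp - wrong_bp) <= tolerance * wrong_bp:
        return "wrong"
    return "ambiguous"


# HindIII digest of a pAO204 candidate.
print(call_orientation(380))   # close to 376 bp
print(call_orientation(670))   # close to 675 bp
print(call_orientation(500))   # matches neither expected band
```

The same comparison could be reused for other diagnostic digests by passing different expected sizes.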

The parent vector for pAO208 is the HIS4, PARS2
plasmid pYJ32 (NRRL B-15891), which was modified to change
the EcoRV site in the tetR gene to a BglII site by
digesting pYJ32 with EcoRV and adding BglII linkers to
create pYJ32(+BglII). This plasmid was digested with
BglII and the 1.75 kb BglII fragment from pAO204
containing the AOX1 promoter-α-factor-GRF-AOX1 3'
expression cassette was inserted. The resulting vector
was called pAO208. The orientation was verified by an
EcoRI digest yielding an 850 bp fragment + vector, as
opposed to 1.1 kb + vector in the other orientation.

a. Construction of plasmid pAOP3:

1. Plasmid pPG2.5 [a pBR322-based plasmid
containing the approximately 2.5 kbp EcoRI-SalI fragment
from plasmid pPG4.0, which plasmid contains the primary
alcohol oxidase gene (AOX1) and regulatory regions and
which is available in an E. coli host from the Northern
Regional Research Center of the United States Department
of Agriculture in Peoria, Illinois as NRRL B-15868] was
linearized with BamHI;

2. The linearized plasmid was digested with
BAL31;

3. The resulting DNA was treated with E. coli
DNA Polymerase I Klenow fragment to enhance blunt ends,
and ligated to EcoRI linkers;

4. The ligation products were transformed
into E. coli strain MM294;

5. Transformants were screened by the colony
hybridization technique using a synthetic oligonucleotide
having the following sequence:

5' TTATTCGAAACGGGAATTCC.

This oligonucleotide contains the AOX1 promoter sequence
up to, but not including, the ATG initiation codon, fused
to the sequence of the EcoRI linker;

6. Positive clones were sequenced by the
Maxam-Gilbert technique. All three positives had the
following sequence:

5'...TTATTCGAAACG[A]GGAATTCC...3'.

They all retained the "A" of the ATG (bracketed in the
above sequence). It was decided that this A would
probably not be detrimental; thus all subsequent clones
are derivatives of these positive clones. These clones
have been given the laboratory designation pAOP1, pAOP2
and pAOP3 respectively.

III. Construction of plasmid pAO815:

Plasmid pAO815 was constructed by mutagenizing
plasmid pAO807 (described hereinbelow) to change the ClaI
site downstream of the AOX1 transcription terminator in
pAO807 to a BamHI site. The oligonucleotide used for
mutagenizing pAO807 had the following sequence: 5' GAC
GTT CGT TTG TGC GGA TCC AAT GCG GTA GTT TAT 3'. The
mutagenized plasmid was called pAO807-Bam. Plasmid
pAO804 was digested with BglII and 25 ng of the 2400 bp
fragment were ligated to 250 ng of the 5400 bp BglII
fragment from BglII-digested pAO807-Bam. The ligation
mix was transformed into MC1061 cells and the correct
construct was verified by digestion with PstI/BamHI to
identify 5700 and 2100 bp sized bands. The correct
construct was called pAO815. The restriction map of the
expression vector pAO815 is shown in Figure 5.
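Two quick arithmetic checks follow from the figures in this paragraph: the molar ratio of the two BglII fragments in the ligation, and whether the diagnostic bands add up to the assembled plasmid. The sketch below uses only the masses and lengths given above; the function name is an assumption, and the ratio relies on the usual simplification that molar amount is proportional to mass divided by length for double-stranded DNA.

```python
def molar_ratio(a_ng: float, a_bp: int, b_ng: float, b_bp: int) -> float:
    """Molar ratio of fragment A to fragment B.

    For double-stranded DNA, moles scale with mass / length, so the
    per-base-pair molecular weight cancels out of the ratio.
    """
    return (a_ng / a_bp) / (b_ng / b_bp)


# 25 ng of the 2400 bp fragment ligated to 250 ng of the 5400 bp fragment.
ratio = molar_ratio(25, 2400, 250, 5400)
print(f"2400 bp : 5400 bp molar ratio = {ratio:.3f}")

# The two ligated fragments and the two diagnostic bands should
# describe a plasmid of the same total size.
assembled_bp = 2400 + 5400
diagnostic_bp = 5700 + 2100
print("assembled:", assembled_bp, "bp; diagnostic bands:", diagnostic_bp, "bp")
```

Both sums come to 7800 bp, so the stated digest pattern is at least internally consistent with the stated fragment sizes.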

a. Plasmid pAO807 was constructed as follows:

1. Preparation of f1-ori DNA

f1 bacteriophage DNA (50 µg) was digested with
50 units of Rsa I and Dra I (according to manufacturer's
directions) to release the approximately 458 bp DNA fragment
containing the f1 origin of replication (ori). The
digestion mixture was extracted with an equal volume of
phenol:chloroform (V/V) followed by extracting the
aqueous layer with an equal volume of chloroform, and
finally the DNA in the aqueous phase was precipitated by
adjusting the NaCl concentration to 0.2M and adding 2.5
volumes of absolute ethanol. The mixture was allowed to
stand on ice (4°C) for 10 minutes and the DNA precipitate
was collected by centrifugation for 30 minutes at 10,000
x g in a microfuge at 4°C.

The DNA pellet was washed 2 times with 70%
aqueous ethanol. The washed pellet was vacuum dried and
dissolved in 25 µl of TE buffer [1.0 mM EDTA in 0.01 M
(pH 7.4) Tris buffer]. This DNA was electrophoresed on
1.5% agarose gel and the approximately 458 bp f1-ori
fragment was electroeluted onto DE81 (Whatman) paper and
eluted from the paper in 1M NaCl. The DNA solution was
precipitated as detailed above and the DNA precipitate
was dissolved in 25 µl of TE buffer (f1-ori fragment).

2. Cloning of f1-ori into Dra I sites of
pBR322

pBR322 (2 µg) was partially digested with 2
units Dra I (according to manufacturer's instructions).
The reaction was terminated by phenol:chloroform
extraction followed by precipitation of DNA as detailed
in step 1 above. The DNA pellet was dissolved in 20 µl
of TE buffer. About 100 ng of this DNA was ligated with
100 ng of f1-ori fragment (step 1) in 20 µl of ligation
buffer by incubating at 14°C overnight with 1 unit of
T4 DNA ligase. The ligation was terminated by heating to
70°C for 10 minutes and then used to transform E. coli
strain JM103 [Janisch-Perron et al., Gene 22, 103
(1983)]. AmpR transformants were pooled and
superinfected with helper phage R408 [Russel et al.,
supra]. Single-stranded phages were isolated from the
media and used to reinfect JM103. AmpR transformants
contained pBRf1-ori, which contains f1-ori cloned into
the Dra I sites (nucleotide positions 3232 and 3251) of
pBR322.

3. Construction of plasmid pAO807

pBRf1-ori (10 µg) was digested for 4 hours at
37°C with 10 units each of Pst I and Nde I. The digested
DNA was phenol:chloroform extracted, precipitated and
dissolved in 25 µl of TE buffer as detailed in step 1
above. This material was electrophoresed on a 1.2%
agarose gel and the Nde I - Pst I fragment (approximately
0.8 kb) containing the f1-ori was isolated and dissolved
in 20 µl of TE buffer as detailed in step 1 above. About
100 ng of this DNA was mixed with 100 ng of pAO804
(described hereinafter) that had been digested with Pst I
and Nde I and phosphatase-treated. This mixture was
ligated in 20 µl of ligation buffer by incubating
overnight at 14°C with 1 unit of T4 DNA ligase. The
ligation reaction was terminated by heating at 70°C for
10 minutes. This DNA was used to transform E. coli
strain JM103 to obtain pAO807.

b. Plasmid pAO804 employed in the above
procedure was constructed as follows:

Plasmid pBR322 was modified as follows to
eliminate the EcoRI site and insert a BglII site into the
PvuII site:

pBR322 was digested with EcoRI, the protruding
ends were filled in with Klenow Fragment of E. coli DNA
polymerase I, and the resulting DNA was recircularized
using T4 ligase. The recircularized DNA was used to
transform E. coli MC1061 to ampicillin-resistance and
transformants were screened for having a plasmid of about
4.37 kbp in size without an EcoRI site. One such
transformant was selected and cultured to yield a
plasmid, designated pBR322ΔRI, which is pBR322 with the
EcoRI site replaced with the sequence:

5'-GAATTAATTC-3'
3'-CTTAATTAAG-5'.

pBR322ΔRI was digested with PvuII and the
linker, of sequence

5'-CAGATCTG-3'
3'-GTCTAGAC-5'

was ligated to the resulting blunt ends employing T4
ligase. The resulting DNAs were recircularized, also
with T4 ligase, and then digested with BglII and again
recircularized using T4 ligase to eliminate multiple
BglII sites due to ligation of more than one linker to
the PvuII-cleaved pBR322ΔRI. The DNAs, treated to
eliminate multiple BglII sites, were used to transform E.
coli MC1061 to ampicillin-resistance. Transformants were
screened for a plasmid of about 4.38 kbp with a BglII
site. One such transformant was selected and cultured to
yield a plasmid, designated pBR322ΔRIBGL, for further
work. Plasmid pBR322ΔRIBGL is the same as pBR322ΔRI
except that pBR322ΔRIBGL has the sequence

5'-CAGCAGATCTGCTG-3'
3'-GTCGTCTAGACGAC-5'

in place of the PvuII site in pBR322ΔRI.
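Both site modifications described for pBR322 can be reproduced with simple string operations, as in the sketch below. The recognition sequences and cut positions for EcoRI (G^AATTC) and PvuII (CAG^CTG) are standard enzyme facts assumed here rather than stated in the text, and the helper names are illustrative only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def fill_in_and_religate(site: str, cut_offset: int) -> str:
    """Top-strand sequence after cutting a palindromic site that leaves
    5' overhangs, filling in both ends, and blunt-end religation."""
    left_end = site[:len(site) - cut_offset]   # left arm after fill-in
    right_end = site[cut_offset:]              # right arm after fill-in
    return left_end + right_end


def insert_linker(site: str, cut_offset: int, linker: str) -> str:
    """Top-strand sequence after a blunt cut and ligation of one linker."""
    return site[:cut_offset] + linker + site[cut_offset:]


ecori_filled = fill_in_and_religate("GAATTC", 1)
pvuii_with_linker = insert_linker("CAGCTG", 3, "CAGATCTG")

print(ecori_filled)        # sequence replacing the EcoRI site
print(pvuii_with_linker)   # sequence replacing the PvuII site
print("GAATTC" in ecori_filled, "AGATCT" in pvuii_with_linker)
```

The two results match the sequences quoted for pBR322ΔRI and pBR322ΔRIBGL: the fill-in destroys the EcoRI site, and the linker introduces a BglII site (AGATCT). Because the linker is its own reverse complement, it reads the same whichever way it ligates.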

pBR322ΔRIBGL was digested with SalI and BglII
and the large fragment (approximately 2.97 kbp) was
isolated. Plasmid pBSAGI5I, which is described in
European Patent Application Publication No. 0,226,752,
was digested completely with BglII and XhoI and an
approximately 850 bp fragment from a region of the
P. pastoris AOX1 locus downstream from the AOX1 gene
transcription terminator (relative to the direction of
transcription from the AOX1 promoter) was isolated. The
BglII-XhoI fragment from pBSAGI5I and the approximately
2.97 kbp SalI-BglII fragment from pBR322ΔRIBGL were
combined and subjected to ligation with T4 ligase. The
ligation mixture was used to transform E. coli MC1061
cells to ampicillin-resistance and transformants were
screened for a plasmid of the expected size
(approximately 3.8 kbp) with a BglII site. This plasmid
was designated pAO801. The overhanging end of the SalI
site from the pBR322ΔRIBGL fragment was ligated to the
overhanging end of the XhoI site on the 850 bp pBSAGI5I
fragment and, in the process, both the SalI site and the
XhoI site in pAO801 were eliminated.

pBSAGI5I was then digested with ClaI and the
approximately 2.0 kbp fragment was isolated. The 2.0 kbp
fragment has an approximately 1.0 kbp segment which
comprises the P. pastoris AOX1 promoter and transcription
initiation site, an approximately 700 bp segment encoding
the hepatitis B virus surface antigen ("HBsAg") and an
approximately 300 bp segment which comprises the
P. pastoris AOX1 gene polyadenylation signal and site-
encoding segments and transcription terminator. The
HBsAg coding segment of the 2.0 kbp fragment is
terminated, at the end adjacent the 1.0 kbp segment with
the AOX1 promoter, with an EcoRI site and, at the end
adjacent the 300 bp segment with the AOX1 transcription
terminator, with a StuI site, and has its subsegment which
codes for HBsAg oriented and positioned, with respect to
the 1.0 kbp promoter-containing and 300 bp transcription
terminator-containing segments, operatively for
expression of the HBsAg upon transcription from the AOX1
promoter. The EcoRI site joining the promoter segment to
the HBsAg coding segment occurs just upstream (with
respect to the direction of transcription from the AOX1
promoter) from the translation initiation signal-encoding
triplet of the AOX1 promoter.

For more details on the promoter and terminator
segments of the 2.0 kbp, ClaI-site-terminated fragment of
pBSAGI5I, see European Patent Application Publication No.
0,226,846 and Ellis et al., Mol. Cell. Biol. 5, 1111
(1985).

Plasmid pAO801 was cut with ClaI and combined
for ligation using T4 ligase with the approximately 2.0
kbp ClaI-site-terminated fragment from pBSAGI5I. The
ligation mixture was used to transform E. coli MC1061 to
ampicillin resistance, and transformants were screened
for a plasmid of the expected size (approximately 5.8
kbp) which, on digestion with ClaI and BglII, yielded
fragments of about 2.32 kbp (with the origin of
replication and ampicillin-resistance gene from pBR322)
and about 1.9 kbp, 1.48 kbp, and 100 bp. On digestion
with BglII and EcoRI, the plasmid yielded an
approximately 2.48 kbp fragment with the 300 bp
terminator segment from the AOX1 gene and the HBsAg
coding segment, a fragment of about 900 bp containing the
segment from upstream of the AOX1 protein-encoding
segment of the AOX1 gene in the AOX1 locus, and a
fragment of about 2.42 kbp containing the origin of
replication and ampicillin resistance gene from pBR322
and an approximately 100 bp ClaI-BglII segment of the
AOX1 locus (further upstream from the AOX1-encoding
segment than the first mentioned 900 bp EcoRI-BglII
segment). Such a plasmid had the ClaI fragment from
pBSAGI5I in the desired orientation; in the opposite,
undesired orientation, there would be EcoRI-BglII
fragments of about 3.3 kbp, 2.38 kbp and 900 bp.

One of the transformants harboring the desired
plasmid, designated pAO802, was selected for further work
and was cultured to yield that plasmid. The desired
orientation of the ClaI fragment from pBSAGI5I in pAO802
had the AOX1 gene in the AOX1 locus oriented correctly to
lead to the correct integration into the P. pastoris
genome at the AOX1 locus of linearized plasmid made by
cutting at the BglII site at the terminus of the 800 bp
fragment from downstream of the AOX1 gene in the AOX1
locus.

pAO802 was then treated to remove the HBsAg
coding segment terminated with an EcoRI site and a StuI
site. The plasmid was digested with StuI and a linker of
sequence:

5'-GGAATTCC-3'
3'-CCTTAAGG-5'

was ligated to the blunt ends using T4 ligase. The
mixture was then treated with EcoRI and again subjected
to ligation using T4 ligase. The ligation mixture was
then used to transform E. coli MC1061 cells to ampicillin
resistance and transformants were screened for a plasmid
of the expected size (5.1 kbp) with EcoRI-BglII fragments
of about 1.78 kbp, 900 bp, and 2.42 kbp and BglII-ClaI
fragments of about 100 bp, 2.32 kbp, 1.48 kbp, and 1.2
kbp. This plasmid was designated pAO803. A transformant
with the desired plasmid was selected for further work
and was cultured to yield pAO803.
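The restriction patterns quoted for the 5.8 kbp and 5.1 kbp plasmids can be sanity-checked by confirming that each set of fragments adds up to the stated plasmid size. The sketch below does only that; the dictionary layout, the function name and the 0.05 kbp tolerance are assumptions made for illustration, and the sizes are the approximate values from the text.

```python
# (expected plasmid size in kbp, fragment sizes in kbp) per digest,
# using the approximate values quoted in the text.
DIGESTS = {
    "5.8 kbp plasmid, ClaI + BglII": (5.8, [2.32, 1.9, 1.48, 0.1]),
    "5.8 kbp plasmid, BglII + EcoRI": (5.8, [2.48, 0.9, 2.42]),
    "5.1 kbp plasmid, EcoRI + BglII": (5.1, [1.78, 0.9, 2.42]),
    "5.1 kbp plasmid, BglII + ClaI": (5.1, [0.1, 2.32, 1.48, 1.2]),
}


def fragments_account_for_plasmid(expected_kbp: float,
                                  fragments_kbp: list[float],
                                  tolerance_kbp: float = 0.05) -> bool:
    """True if the fragments sum to the plasmid size within tolerance.

    tolerance_kbp is an assumed allowance for rounding of the
    approximate sizes, not a value from the text.
    """
    return abs(sum(fragments_kbp) - expected_kbp) <= tolerance_kbp


for label, (expected, fragments) in DIGESTS.items():
    ok = fragments_account_for_plasmid(expected, fragments)
    print(f"{label}: sum = {sum(fragments):.2f} kbp, consistent = {ok}")
```

A check of this kind only shows that the quoted sizes are mutually consistent; it says nothing about fragment order or orientation.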

Plasmid pAO804 was then made from pAO803 by
inserting, into the BamHI site from pBR322 in pAO803, an
approximately 2.75 kbp BglII fragment from the
P. pastoris HIS4 gene. See, e.g., Cregg et al., Mol.
Cell. Biol. 5, 3376 (1985) and European Patent
Application Publication Nos. 0,180,899 and 0,188,677.
pAO803 was digested with BamHI and combined with the HIS4
gene-containing BglII site-terminated fragment and the
mixture subjected to ligation using T4 ligase. The
ligation mixture was used to transform E. coli MC1061
cells to ampicillin-resistance and transformants were
screened for a plasmid of the expected size (7.85 kbp),
which is cut by SalI. One such transformant was selected
for further work, and the plasmid it harbors was
designated pAO804.

pAO804 has one SalI-ClaI fragment of about 1.5
kbp and another of about 5.0 kbp and a ClaI-ClaI fragment
of 1.3 kbp; this indicates that the direction of
transcription of the HIS4 gene in the plasmid is the same
as the direction of transcription of the ampicillin
resistance gene and opposite the direction of
transcription from the AOX1 promoter.

The orientation of the HIS4 gene is pA0804 is 
20 not critical to the function of the plasmid or of its 
derivatives with cDNA coding segments inserted at the 
EcoRI site between the AOX1 promoter and terminator 
segments. Thus, a plasmid with the HIS4 gene in the 
orientation opposite that of the HIS4 gene in pA0804 
25 would also be effective for use in accordance with the 
present invention. 

Example 2

Strain development and characterization

Plasmid pSCD103, the construction of which is
described in Example 1, was used to develop Mut+ and Mut-
strains of P. pastoris. The His- strain GS115 (ATCC
20864) was the host for all transformations.
Transformations were accomplished by the whole-cell LiCl
method [Ito et al., J. Bacteriol. 153(1), 163 (1983)],
with minor modification necessary for adaptation to
P. pastoris.

To develop Mut+ strains, pSCD103 was digested
with SacI, which linearizes the vector within the AOX1
promoter region, and 10 µg of the linearized vector were
used to transform GS115. Histidine prototrophs were
selected.

To develop Mut- strains, pSCD103 was digested
with BglII, thereby liberating an expression cassette
comprised of the AOX1 promoter region, αMF leader-V1 gene,
AOX1 transcription termination signals, HIS4 gene for
selection, and AOX1 3' region. Both ends of this
expression cassette contain long sequences which are
homologous to the 5' and 3' ends of the AOX1 locus. 10 µg
of the linearized vector were used to transform GS115
cells. Histidine prototrophs were selected and screened
for the Mut- phenotype by replica plating colonies from
glucose-containing media to methanol-containing media,
and evaluating growth rate on methanol. Slow growth on
methanol was indicative of the Mut- phenotype. Several
His+ Mut- colonies were identified.

To characterize the Mut+ and Mut- transformants
for cassette copy number and site of integration, DNA
from several of the selected colonies was digested with
EcoRI and probed with nick-translated pSCD103. The
Southern analysis yielded the following information:

Strain name    Mut+/-   Copy number   Site of integration

G+SCD103S03    Mut+     one           AOX1
G+SCD103S16    Mut+     two           AOX1
G-SCD103S03    Mut-     one           AOX1 (disruption)

Example 3

Fermentation in two-liter fermentors

a. Fermentor start-up and general operation

The 2-liter fermentors (Biolafitte, LSL
Biolafitte, Princeton, NJ) were autoclaved at a volume of
one liter containing 5X Basal Salts (21 ml/l 85%
phosphoric acid, 0.9 g/l Calcium Sulfate x 2H2O, 14.3 g/l
Potassium Sulfate, 11.7 g/l Magnesium Sulfate x 7H2O,
3.25 g/l Potassium Hydroxide) and 5% (w/v) glycerol.
After sterilization, 5 ml of a PTM1 trace salts solution
(6.0 g/l Cupric Sulfate x 5H2O, 0.08 g/l Sodium Iodide,
3.0 g/l Manganese Sulfate x H2O, 0.2 g/l Sodium Molybdate
x 2H2O, 0.02 g/l Boric Acid, 0.5 g/l Cobalt Chloride, 20
g/l Zinc Chloride, 65 g/l Ferrous Sulfate x 7H2O, 0.2 g/l
Biotin, and 5.0 ml of Sulfuric Acid) was added, and the
pH of the fermentor was adjusted to 3.0 with the addition
of concentrated Ammonium Hydroxide. During the
fermentation, the pH was controlled at 3.0 with the
addition of a 50% (v/v) Ammonium Hydroxide solution.
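
As a rough illustration of what the 5 ml trace-salts addition means for the one-liter batch, the sketch below dilutes each listed stock concentration into the starting medium. It is a minimal Python sketch under stated assumptions: the working volume is taken as exactly 1000 ml plus the 5 ml addition (acid, base and inoculum volumes are ignored), and all names are hypothetical rather than taken from the original disclosure.

```python
# Approximate final trace-salt levels after adding 5 ml of trace salts stock
# to about one liter of medium. Stock values (g/l) are those listed in the text;
# the volume bookkeeping is a simplifying assumption.

PTM1_STOCK_G_PER_L = {
    "cupric sulfate x 5H2O": 6.0,
    "sodium iodide": 0.08,
    "manganese sulfate x H2O": 3.0,
    "sodium molybdate x 2H2O": 0.2,
    "boric acid": 0.02,
    "cobalt chloride": 0.5,
    "zinc chloride": 20.0,
    "ferrous sulfate x 7H2O": 65.0,
    "biotin": 0.2,
}


def diluted_mg_per_l(stock_g_per_l, added_ml=5.0, medium_ml=1000.0):
    """Concentration (mg/l) after diluting added_ml of stock into medium_ml."""
    return stock_g_per_l * 1000.0 * added_ml / (medium_ml + added_ml)


for name, stock in PTM1_STOCK_G_PER_L.items():
    print(f"{name:26s} {diluted_mg_per_l(stock):8.2f} mg/l")
```

The same helper can be reused for the feed solutions by passing the 12 ml/l addition as `added_ml=12.0`.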

The fermentors were then inoculated with 50 ml
of an overnight culture grown in 6.75 g/l Difco yeast
nitrogen base, 2% glycerol, 0.1 M potassium phosphate, pH
6.0. After 16 hours of fermentor growth, the pH of the
medium was dropped to 2.6 and the cells continued to grow
in a batch mode to exhaust the original charge of
glycerol. Upon glycerol exhaustion, a 50% (w/v) glycerol
feed containing 12 ml/l PTM1 trace salts was initiated at
a feed rate of 5 to 20 ml/h. After 200 ml of the
glycerol feed was added into the fermentor, a 100%
methanol feed containing 12 ml/l PTM1 trace salts was
initiated at 1 ml/h, and the glycerol feed was shut off
after 1 hour of methanol feeding. After 4 hours of
methanol feeding, the methanol feed was increased to 5-6
ml/h over an 8-12 hour period and was maintained at this
rate for the remainder of the fermentation. The
dissolved oxygen concentration was maintained above 20%
of saturation by adjusting agitation and aeration as
needed. The temperature was controlled at 30 °C and
foaming was controlled by the addition of a 5% solution
of Struktol J-673 antifoam (Struktol Co., Stow, OH).
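
The methanol feeding schedule described above can be written as a simple function of time. This is a minimal Python sketch, not the control logic actually used: the text gives only the start rate, the hold time, the final range (5-6 ml/h) and the ramp duration (8-12 hours), so the linear ramp shape, the 5.5 ml/h midpoint and the 10-hour ramp are assumptions, and the function names are hypothetical.

```python
# Sketch of the methanol feed profile: 1 ml/h for the first 4 hours, then a ramp
# to the final rate, then constant. The linear ramp and default values are assumed.


def methanol_feed_ml_per_h(t_h, final_rate=5.5, ramp_h=10.0):
    """Methanol feed rate (ml/h) at t_h hours after the methanol feed starts."""
    if t_h < 0.0:
        return 0.0                      # still in the glycerol phases
    if t_h <= 4.0:
        return 1.0                      # initial rate, held for 4 hours
    if t_h >= 4.0 + ramp_h:
        return final_rate               # maintained for the rest of the run
    # Assumed linear increase from 1 ml/h to final_rate over ramp_h hours.
    return 1.0 + (final_rate - 1.0) * (t_h - 4.0) / ramp_h


def glycerol_feed_on(t_h):
    """The glycerol feed overlaps the first hour of methanol feeding."""
    return t_h < 1.0


for t in (0.0, 0.5, 2.0, 9.0, 14.0, 71.0):
    print(t, methanol_feed_ml_per_h(t), glycerol_feed_on(t))
```

Passing `final_rate=5.0, ramp_h=8.0` or `final_rate=6.0, ramp_h=12.0` gives the two ends of the range stated in the text.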

Before harvesting the fermentor, the pH was
decreased to 2.5 with the addition of 85% phosphoric
acid. The contents were then centrifuged to remove cells
and the supernatant was filter-sterilized through a
0.22 µ Corning filter (Corning Glass Co., Corning, NY).
The supernatant was then frozen at -20 °C.

b. Growth of Mut+ and Mut- strains

Run 568: G+SCD103S03
Run 570: G+SCD103S16
Run 571: G-SCD103S03
Run 585: G+SCD103S03
Run 593: G+SCD103S16

Fermentation Runs 568, 570, 571, 585, and 593
were conducted as described above, except that Runs 568,
570 and 571 were conducted at pH 5.0; Run 585 was
performed at pH 3.5 and the pH was not adjusted to pH 2.5
at the end of the fermentation run; Run 593 was conducted
as hereinabove described.

Figure 6 shows the time course for cell yield
for one-liter fermentation runs with strains G+SCD103S03
(Run 568), G+SCD103S16 (Run 570), and G-SCD103S03 (Run
571). Cell yield was calculated as the mass of wet cells
per liter of broth after centrifugation. A conversion
factor of 0.25 was used to calculate yield of dry cells
per liter.
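
The wet-to-dry conversion is a single multiplication. A minimal Python sketch follows; the function name is hypothetical and the 400 g/l input is an illustrative value, not a measurement from these runs.

```python
# Dry cell yield from wet cell yield, using the 0.25 conversion factor stated in the text.

WET_TO_DRY_FACTOR = 0.25


def dry_cell_yield(wet_g_per_l, factor=WET_TO_DRY_FACTOR):
    """Convert wet cell mass per liter of broth to dry cell mass per liter."""
    return wet_g_per_l * factor


# Illustrative input only: 400 g/l wet cells corresponds to 100 g/l dry cells.
print(dry_cell_yield(400.0))
```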

The single-copy Mut+ and Mut- strains grew at
equivalent rates, whereas the two-copy Mut+ strain showed
slightly decreased cell yield on methanol. However,
because Pichia transformants carrying multiple copies of
an expression cassette may express higher levels of
heterologous proteins in the fermentor than do the
strains with a single copy, another fermentation, Run 593,
was conducted to analyze the level of recombinant CD4-V1 in
the broth of the two-copy strain, G+SCD103S16.

Figures 7A and 7B show the time course for cell
yield and CD4-V1 production, respectively, for
fermentation Run 593. The expression level was
determined for unfiltered, reduced broth samples, and was
estimated by quantitative Western blot analysis to be 130
mg/liter after 71 hours on methanol. The level was
continuing to increase when the fermentor was harvested.

Example 4

Analysis of secreted CD4-V1

a. Western blot analysis

The V1 region of CD4 contains a single disulfide
bond between two cysteine residues, which are located at
positions 16 and 84 of the mature CD4, near the N- and
C-termini, respectively. Therefore, non-reduced V1
molecules from fermentor broth samples will co-migrate
with the V1 standard, regardless of whether the molecule
has been nicked between the cysteines. On the other
hand, reduced samples will only co-migrate with the
standard if the peptide bonds between the cysteines are
intact. Separating P. pastoris broth samples on
non-reducing gels yielded a quantitative measurement of the
total amount of V1 contained in the fermentor broth, while
reducing gels yielded the amount of intact V1.
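
The reasoning above reduces to a small truth table plus one subtraction. The following minimal Python sketch restates it; the function names are hypothetical and the example numbers are illustrative, not measurements from the runs.

```python
# Why non-reducing gels report total V1 and reducing gels report intact V1.
# A V1 molecule nicked between the two cysteines is held together by the
# disulfide bond until that bond is reduced.


def comigrates_with_standard(nicked: bool, reduced: bool) -> bool:
    """True if the molecule runs at the position of the intact V1 standard."""
    return not (nicked and reduced)


def summarize(total_mg_per_l: float, intact_mg_per_l: float):
    """Return (nicked amount, fraction intact) from the two gel measurements."""
    nicked = total_mg_per_l - intact_mg_per_l
    fraction_intact = intact_mg_per_l / total_mg_per_l if total_mg_per_l > 0 else 0.0
    return nicked, fraction_intact


for nicked in (False, True):
    for reduced in (False, True):
        print(f"nicked={nicked!s:5} reduced={reduced!s:5} "
              f"co-migrates={comigrates_with_standard(nicked, reduced)}")

# Illustrative values only: 100 mg/l total (non-reducing) and 40 mg/l intact (reducing).
print(summarize(100.0, 40.0))
```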
Fermentor broth samples from Runs 571 (Mut-),
568 and 585 (single copy Mut+) and 593 (multicopy Mut+)
were analyzed by Western blotting. Ten microliters of
each sample were mixed with an equal volume of 2X Laemmli
sample buffer containing 200 mM DTT (+DTT). In some
cases, the 2X sample buffer lacked any reducing agent
(-DTT). Fermentor samples and 2X sample buffer were
mixed and immediately boiled for 5 minutes (+DTT), or
mixed and immediately placed at room temperature until
the gel was loaded (-DTT). Samples thus prepared were
separated by electrophoresis, at 4 °C, on 15% SDS-PAGE
gels, at 150 V constant voltage, until the bromophenol
blue tracking dye had reached the bottom of the gel.
E. coli-produced V1 (Smith Kline & French), used
as standard, was treated in an identical manner, and
separated on the same gels as a standard control.
Reduced (+DTT) and non-reduced (-DTT) samples were
separated on different gels. For quantitation of the
P. pastoris-produced V1 in fermentor broth, fermentor
samples were separated in non-adjacent lanes, to prevent
spillover errors, and several different amounts of V1
standards were separated on the same gel to generate an
internal standard curve.
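
Quantitation against an internal standard curve can be sketched as a straight-line fit followed by inverse prediction. This is a minimal Python sketch under stated assumptions, not the procedure actually used: a linear relation between signal and amount of V1 is assumed, the standard amounts, signal values and loaded volume are made-up illustrative numbers, and all names are hypothetical.

```python
# Read an unknown off a standard curve: fit signal = slope * amount + intercept
# to the standards run on the same filter, then invert the line for the sample.
# Linearity and every number below are assumptions for illustration.


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_signal(signal, slope, intercept):
    """Invert the standard curve to get the amount (ng) for a measured signal."""
    return (signal - intercept) / slope


def concentration_mg_per_l(amount_ng, loaded_ul):
    """ng per microliter is numerically equal to mg per liter."""
    return amount_ng / loaded_ul


# Hypothetical standards (ng of V1) and their signals (counts per minute).
standard_ng = [25.0, 50.0, 100.0, 200.0]
standard_cpm = [1200.0, 2200.0, 4200.0, 8200.0]

slope, intercept = fit_line(standard_ng, standard_cpm)
sample_ng = amount_from_signal(5000.0, slope, intercept)
print(sample_ng, concentration_mg_per_l(sample_ng, loaded_ul=1.0))
```

Fitting a separate curve for each filter, as the text describes, keeps filter-to-filter differences in transfer and exposure out of the comparison.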

Gels were transblotted to nitrocellulose
([...] pore size) for 90 minutes at 4 °C, using a carbonate
buffer system [S.D. Dunn, Anal. Biochem. 152, 144 (1986)].
The filters were blocked for 16 hours at room temperature
in Western blocking buffer (WBB), incubated for two hours
with a 1:1000 dilution of rabbit anti-sCD4 [SK&F; Arthos
et al., Cell 57, 469 (1989)] in WBB at room temperature,
washed four times for 15 minutes in WBB, incubated for
one hour at room temperature in a 1:5000 dilution of low
specific activity 125I-Protein A (New England Nuclear) in
WBB, washed four times for 15 minutes each time in WBB,
air dried and exposed to Kodak X-omat film at -70 °C, with
two intensifying screens. V1 bands, identified by
reaction with anti-sCD4, were excised from the
nitrocellulose filters and quantitated using a
gamma-counter. The fermentor samples were quantitated by
comparison with the V1 standard curve on each filter. The
results of these analyses are summarized in the following
Table:









                                  V1 YIELD (mg/l)

RUN    MUT+/-              pH     TOTAL    INTACT

571    Mut-                5.0    100      0
568    single copy Mut+    5.0    100      0
585    single copy Mut+    3.5    125      30
570    two-copy Mut+       5.0    100      0
593    two-copy Mut+       2.6    100      100


b. Amino acid sequencing

We have further characterized the Pichia-produced
recombinant CD4-V1 by determining the N-terminal
sequence of the protein which comigrates with the V1
standard on reducing SDS-PAGE. As shown in the
silver-stained reducing gel of V1 standard and fermentor broth
samples pictured in Figure 8, control fermentor broth
obtained from the fermentation of Pichia strain G-PAO815
does not contain a protein which comigrates with the V1
standard. In contrast, Pichia rCD4-V1 appears to be the
major lower molecular weight protein species present in
broth samples from the fermentation of strain
G+SCD103S16. To ensure that broth components did not
affect the migration or staining of V1 in polyacrylamide
gels, E. coli-derived V1 standard and rCD4-V1-containing
broth from fermentation of G+SCD103S16 were separately
mixed with Pichia control broth and analyzed by reducing
SDS-PAGE. As shown in Figure 8, the electrophoretic
characteristics of the V1 standard and Pichia-produced
rCD4-V1 were unaltered by exposure to Pichia control
broth; the V1 standard and Pichia rCD4-V1 co-migrated to
the same gel position and exhibited similar staining
properties in the presence and absence of control broth.
The first 15 residues of Pichia rCD4-V1 were determined,
and found to be identical to the published sequence
(Maddon et al., supra) for mature human CD4-V1 (Figure 9).
From this result, it was concluded that the N-terminus of
the recombinant CD4-V1 was correctly processed from the
αMF leader.
c. Stability

The stability of the V1 molecules in the
fermentor broth, which had been adjusted to pH 2.5, was
analyzed in two ways. First, the stability of V1 during
storage was analyzed by subjecting identical broth
samples to a freeze-thaw cycle, followed by incubation of
the samples for 20 hours, under varying conditions
(Figure 10, lanes 2-4). No change was observed in the
amount of immunoreactive material or the proportion of
intact V1 in the different samples. In the second
experiment, the pH of the broth samples was raised to 5.0
and the samples incubated under the same conditions as
before (Figure 10, lanes 5-7). As seen in the Figure,
some broadening of the intact V1 band occurred under
these conditions. Therefore, while the rCD4-V1 is stable in
pH 2.5 broth samples, some degree of proteolytic
degradation may occur in samples at elevated pHs.

CLAIMS:

1. A Pichia pastoris (P. pastoris) cell
containing in its genome at least one copy of a DNA
sequence operably encoding in P. pastoris at least a
portion of human CD4 glycoprotein, containing the site of
interaction between CD4 and the human immunodeficiency
virus (HIV), in operational association with a DNA
sequence encoding a signal sequence which functions to
direct secretion of said human CD4 glycoprotein or a
portion thereof in P. pastoris, both under the regulation
of a promoter region of a P. pastoris gene.

2. A P. pastoris cell according to Claim 1,
wherein said signal sequence-encoding DNA comprises a DNA
sequence encoding the S. cerevisiae AMF pre-pro sequence,
and a DNA sequence encoding AMF processing-site lys-arg.

3. A P. pastoris cell according to Claim 2,
wherein said P. pastoris gene is the P. pastoris AOX1
gene.

4. A P. pastoris cell according to Claim 3
wherein said DNA sequence operably encodes in P. pastoris
the V1 region of human CD4 glycoprotein.

5. A P. pastoris cell according to Claim 4
containing at least two copies of said DNA sequences.

6. A P. pastoris cell containing in its
genome at least one copy of an expression cassette
comprising in the direction of transcription, a promoter
region of a first P. pastoris gene, a DNA sequence
operably encoding in P. pastoris at least a portion of
human CD4 glycoprotein, containing the site of
interaction between CD4 and the HIV virus, preceded by a
DNA sequence encoding a signal sequence directing the
secretion of said glycoprotein or a portion thereof in P.
pastoris, and a transcription terminator of a second P.
pastoris gene, said first and second P. pastoris genes
being identical or different, the segments of said
expression cassette being in operational association.

7. A P. pastoris cell according to Claim 6,
wherein said signal sequence-encoding DNA comprises a DNA
sequence encoding the S. cerevisiae AMF pre-pro sequence
and a DNA sequence encoding AMF processing-site lys-arg.

8. A P. pastoris cell according to Claim 7
wherein said first and second P. pastoris genes are
identical and are the P. pastoris AOX1 gene.

9. A P. pastoris cell according to Claim 8
wherein said DNA sequence operably encodes in P. pastoris
the V1 region of human CD4 glycoprotein.

10. A P. pastoris cell according to Claim 9
containing at least two copies of said expression
cassette.

11. A P. pastoris cell according to Claim 10,
containing two copies of said expression cassette
integrated by addition at the AOX1 locus of said P.
pastoris genome.

12. A P. pastoris cell according to Claim 9,
containing a single copy of said expression cassette
integrated by addition at the AOX1 locus of said P.
pastoris genome.

13. A P. pastoris cell according to Claim 9,
containing a single copy of said expression cassette
integrated by gene replacement at the AOX1 locus of said
P. pastoris genome.

14. A DNA fragment optionally contained
within, or which is, a circular plasmid comprising at
least one copy of an expression cassette comprising in
the direction of transcription, a promoter region of a
first P. pastoris gene, a DNA sequence operably encoding
in P. pastoris at least a portion of human CD4
glycoprotein, containing the site of interaction between
CD4 and the HIV virus, preceded by a DNA sequence
encoding a signal sequence directing the secretion of
said glycoprotein or a portion thereof in P. pastoris,
and a transcription terminator of a second P. pastoris
gene, said first and second P. pastoris genes being
identical or different, the segments of said expression
cassette being in operational association.

15. A DNA fragment according to Claim 14,
wherein said signal sequence-encoding DNA comprises a DNA
sequence encoding the S. cerevisiae AMF pre-pro sequence,
and a DNA sequence encoding AMF processing-site lys-arg.

16. A DNA fragment according to Claim 15,
wherein said first and second P. pastoris genes are
identical and are the P. pastoris AOX1 gene.

17. A DNA fragment according to Claim 16
wherein said DNA sequence operably encodes in P. pastoris
the V1 region of human CD4 glycoprotein.

18. A DNA fragment according to Claim 16,
further comprising a selectable marker gene and ends
having sufficient homology with a target gene to effect
integration of said DNA fragment therein.

19. A DNA fragment according to Claim 18,
wherein said target gene is the P. pastoris AOX1 gene.

20. A DNA fragment according to Claim 18 which
is a BglII digest of the expression vector pSCD103.

21. A DNA fragment according to Claim 18,
which is a SacI digest of the expression vector pSCD103.

22. An expression vector containing at least
one copy of an expression cassette comprising in the
direction of transcription, a promoter region of a first
P. pastoris gene, a DNA sequence operably encoding in P.
pastoris at least a portion of human CD4 glycoprotein,
containing the site of interaction between CD4 and the
HIV virus, preceded by a DNA sequence encoding a signal
sequence directing the secretion of said glycoprotein or
a portion thereof in P. pastoris, and a transcription
terminator of a second P. pastoris gene, said first and
second P. pastoris genes being identical or different,
the segments of said expression cassette being in
operational association.

23. An expression vector according to Claim
22, wherein said signal sequence-encoding DNA comprises a
DNA sequence encoding the S. cerevisiae AMF pre-pro
sequence, and a DNA sequence encoding AMF processing-site
lys-arg.

24. An expression vector according to Claim
23, further comprising sequences allowing for its
replication and selection in bacteria.

25. An expression vector according to Claim
24, which is a pBR322 derivative.

26. An expression vector according to Claim
25, which is the Pichia expression vector pSCD103.

27. A culture of viable P. pastoris cells
according to any one of Claims 1 to 13.

28. A process for producing and secreting at
least a portion of human CD4 glycoprotein, containing the
site of interaction between CD4 and the HIV virus, into
the culture medium comprising growing P. pastoris
transformants containing in their genome at least one
copy of a DNA sequence operably encoding in P. pastoris
at least a portion of human CD4 glycoprotein, containing
the site of interaction between CD4 and the HIV virus, in
operational association with a DNA sequence encoding a
signal sequence directing the secretion of said
glycoprotein or a portion thereof in P. pastoris, both
under the regulation of a promoter region of a P.
pastoris gene, under conditions allowing the expression
of said DNA sequences in said P. pastoris and secretion
of said glycoprotein or a portion thereof into the
culture medium in a substantially pure form,
substantially devoid of degradation products.

29. A process according to Claim 28, wherein
said signal sequence is the S. cerevisiae AMF pre-pro
sequence.

30. A process according to Claim 29, wherein
said transformants are developed from the P. pastoris
his4- strain GS115.

31. A process according to Claim 30, wherein
said transformants have the Mut+ phenotype.

32. A process according to Claim 28, which
comprises:

a. growing said P. pastoris transformants on a
medium containing repressing carbon source to generate
cell mass in absence of heterologous gene expression,

b. continuing growth under glycerol limitation
conditions, and

c. initiating heterologous gene expression by
adding methanol to the medium, and keeping the pH at or
below about 3.5 during said heterologous gene expression.

33. A process according to Claim 32, wherein
the pH is kept between about 2.5 and about 3.5 during
heterologous gene expression.

34. A process according to Claim 33, wherein
the pH is kept between about 2.5 and about 3.0 during
heterologous gene expression.

35. A process according to any one of Claims
28 to 34, further comprising the step of harvesting said
human CD4 glycoprotein or a portion thereof from the
culture medium.

36. A process for producing a heterologous
protein in P. pastoris, wherein the pH of the culture
medium is maintained at or below about 3.5 during
heterologous gene expression.

37. A process according to Claim 36, wherein
said heterologous protein is secreted into the
fermentation medium.

38. A process according to Claim 36, wherein
the pH is maintained between about 2.5 and about 3.5.

39. Substantially pure human CD4 glycoprotein
or a portion thereof containing the site of interaction
between CD4 and the human immunodeficiency virus (HIV)
produced in yeast.

40. Substantially pure human CD4 glycoprotein
or a portion thereof according to Claim 39 produced in

CAAGCCCAGAGCCCTGCCATTTCTGTGGGCTCAGGTCCCTACTGCTCAGCCCCTTCCTCC

                                        -20
                met asn arg gly val pro phe arg his leu leu
CTCGGCAAGGCCACA ATG AAC CGG GGA GTC CCT TTT AGG CAC TTG CTT  108

leu val leu gln leu ala leu leu pro ala ala thr gln gly asn
CTG GTG CTG CAA CTG GCG CTC CTC CCA GCA GCC ACT CAG GGA AAC

                                    +10                 *
lys val val leu gly lys lys gly asp thr val glu leu thr cys
AAA GTG GTG CTG GGC AAA AAA GGG GAT ACA GTG GAA CTG ACC TGT  198

                +20                                     +30
thr ala ser gln lys lys ser ile gln phe his trp lys asn ser
ACA GCT TCC CAG AAG AAG AGC ATA CAA TTC CAC TGG AAA AAC TCC

                                    +40
asn gln ile lys ile leu gly asn gln gly ser phe leu thr lys
AAC CAG ATA AAG ATT CTG GGA AAT CAG GGC TCC TTC TTA ACT AAA  288

                +50                                     +60
gly pro ser lys leu asn asp arg ala asp ser arg arg ser leu
GGT CCA TCC AAG CTG AAT GAT CGC GCT GAC TCA AGA AGA AGC CTT

                                    +70
trp asp gln gly asn phe pro leu ile ile lys asn leu lys ile
TGG GAC CAA GGA AAC TTC CCC CTG ATC ATC AAG AAT CTT AAG ATA  378

                +80         *                           +90
glu asp ser asp thr tyr ile cys glu val glu asp gln lys glu
GAA GAC TCA GAT ACT TAC ATC TGT GAA GTG GAG GAC CAG AAG GAG

                                    +100
glu val gln leu leu val phe gly leu thr ala asn ser asp thr
GAG GTG CAA TTG CTA GTG TTC GGA TTG ACT GCC AAC TCT GAC ACC  468

                +110                                    +120
his leu leu gln gly gln ser leu thr leu thr leu glu ser pro
CAC CTG CTT CAG GGG CAG AGC CTG ACC CTG ACC TTG GAG AGC CCC

                                    +130   *
[...]  558

FIG. 11-1

                +140                                    +150
asn ile gln gly gly lys thr leu ser val ser gln leu glu leu
AAC ATA CAG GGG GGG AAG ACC CTC TCC GTG TCT CAG CTG GAG CTC

                                    +160
gln asp ser gly thr trp thr cys thr val leu gln asn gln lys
CAG GAT AGT GGC ACC TGG ACA TGC ACT GTC TTG CAG AAC CAG AAG

                +170                                    +180
lys val glu phe lys ile asp ile val val leu ala phe gln lys
AAG GTG GAG TTC AAA ATA GAC ATC GTG GTG CTA GCT TTC CAG AAG

                                    +190
ala ser ser ile val tyr lys lys glu gly glu gln val glu phe
GCC TCC AGC ATA GTC TAT AAG AAA GAG ... GAA CAG GTG ... TTC

                +200                                    +210
ser phe pro leu ala phe thr val glu lys leu thr gly ser gly
TCC TTC CCA CTC GCC TTT ACA GTT GAA AAG CTG ACG GGC AGT GGC

                                    +220
glu leu trp trp gln ala glu arg ala ser ser ser lys ser trp
GAG CTG TGG TGG CAG GCG GAG AGG GCT TCC TCC TCC AAG TCT TGG

                +230                                    +240
ile thr phe asp leu lys asn lys glu val ser val lys arg val
ATC ACC TTT GAC CTG AAG AAC AAG GAA GTG TCT GTA AAA CGG GTT

                                    +250
thr gln asp pro lys leu gln met gly lys lys leu pro leu his
ACC CAG GAC CCT AAG CTC CAG ATG GGC AAG AAG CTC CCG CTC CAC

                +260                                    +270
leu thr leu pro gln ala leu pro gln tyr ala gly ser gly asn
CTC ACC CTG CCC CAG GCC TTG CCT CAG TAT GCT GGC TCT GGA AAC

                                    +280
leu thr leu ala leu glu ala lys thr gly lys leu his gln glu
CTC ACC CTG GCC CTT GAA GCG AAA ACA GGA AAG TTG CAT CAG GAA

                +290                                    +300
val asn leu val val met arg ala thr gln leu gln lys asn leu
GTG AAC CTG GTG GTG ATG AG. GCC ACT CAG CTC CAG AAA AAT TTG

                                    +310
thr cys glu val trp gly pro thr ser pro lys leu met leu ser
ACC TGT GAG GTG TGG GGA CCC ACC TCC CCT AAG CTG ATG CTG AGC

FIG. 11-2

                +320                                    +330
leu lys leu glu asn lys glu ala lys val ser lys arg glu lys
TTG AAA CTG GAG AAC AAG GAG GCA AAG GTC TCG AAG CGG GAG AAG

                                    +340            *
ala val trp val leu asn pro glu ala gly met trp gln cys leu
GCG GTG TGG GTG CTG AAC CCT GAG GCG GGG ATG TGG CAG TGT CTG  1188

leu ser asp ser gly gln val leu leu glu ser asn ile lys val
CTG AGT GAC TCG GGA CAG GTC CTG CTG GAA TCC AAC ATC AAG GTT

                                    +370
leu pro thr trp ser thr pro val gln pro met ala leu ile val
CTG CCC ACA TGG TCC ACC CCG GTG CAG CCA ATG GCC CTG ATT GTG  1278

                +380                                    +390
[...]

                                    +400
phe phe cys val arg cys arg his arg arg arg gln ala glu arg
TTC TTC TGT GTC AGG TGC CGG CAC CGA AGG CGC CAA GCA GAG CGG  1368

                +410                                    +420
met ser gln ile lys arg leu leu ser glu lys lys thr cys gln
ATG TCT CAG ATC AAG AGA CTC CTC AGT GAG AAG AAG ACC TGC CAG

                                    +430
[...]

GGCCAGGCAGATCCCACTTGCAGCCTCCCCAGGTGTCTGCCCCGCGTTTCCTGCCTGCGG
ACCAGATGAATGTAGCAGATCCCACGCTCTGGCCTCCTGTTCGTCCTCCCTACAATTTG  1578
CCATTGTTTCTCCTGGGTTAGGCCCCGGCTTCACTGGTTGAGTGTTGCTCTCTAGTTTCC
AGAGGCTTAATCACACCCTCCTCCACGCCATTTCCTTTTCCTTCAAGCCTAGCCCTTCT  1697
CTCATTATTTCTCTCTGACCCTCTCCCCACTGCTCATTTGGATCC  1742

FIG. 11-3
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